English Language Teaching; Vol. 6, No. 10; 2013 
ISSN 1916-4742 E-ISSN 1916-4750 
Published by Canadian Center of Science and Education 


Distribution of Articles in Written Composition among Malaysian ESL Learners

Mia Emily Abdul Rahim 1 , Emma Marini Abdul Rahim 1 & Chia Han Ning 1 

1 Department of Language and Humanities Education, Faculty of Educational Studies, Universiti Putra Malaysia, 
Malaysia 

Correspondence: Mia Emily Abdul Rahim, Department of Language and Humanities Education, Faculty of 
Educational Studies, Universiti Putra Malaysia, 43400 UPM, Serdang, Selangor Darul Ehsan, Malaysia. E-mail: 
mia.emily@yahoo.co.nz 

Received: May 13, 2013 Accepted: July 2, 2013 Online Published: September 4, 2013 
doi:10.5539/elt.v6n10p149 URL: http://dx.doi.org/10.5539/elt.v6n10p149


Abstract 

The study investigated the distribution patterns of the English articles (a, an, and the), as well as the
distribution of their colligation patterns, in English written compositions by Malaysian ESL learners.
This paper reports the results of a corpus-based study on articles used by these learners. The method was
quantitative content analysis, using data from the Malaysian Corpus of Students’ Argumentative Writing
(MCSAW). The findings indicated that all three articles were present throughout the compositions and that
together they made up 18 percent of all tokens. The findings also showed that the colligation patterns of the
articles were unevenly distributed: some were used heavily, while others were seldom applied in the
compositions. The findings can help language teachers identify areas of the structure that need further
emphasis in their teaching.

Keywords: grammar articles, composition, distribution patterns, colligation patterns, corpus-based study

1. Introduction

1.1 Statement of the Problem 

In the English language system, articles are used with nouns to provide more information about the nouns and
to indicate their grammatical definiteness. Articles are among the most frequently used components of English
grammar (Mukundan, Leong & Nimehchisalem, 2012, p. 62). However, they are also classified as one of the
most difficult grammatical components to learn and acquire, especially for non-native speakers of English, or
L2 learners. This difficulty is noted in most findings of lexical studies and, more recently, of discourse-level
studies. Kim (2006, p. 1) argues that L2 learners often encounter difficulties because the English article system
is, in most cases, known to pose learnability issues. The problem is compounded by the observation of Master
(2002) that English articles feature as substantial elements in error analysis even among advanced L2 learners
who possess a strong command of both spoken and written English.

It is widely accepted that learners are exposed to the articles a, an and the at an early stage of L2 learning.
This is the case in the Malaysian learning context, where students are taught articles from as young as the age
of 4 in kindergarten or preschool education. Suhaila Mokhtar (2002), as cited in Mukundan, Leong &
Nimehchisalem (2012, p. 62), stated that “although Form 1 to Form 5 learners are exposed to articles, they still
commit frequent errors in using them in sentences”. These difficulties are not encountered by L2 learners
alone. Teachers, too, have great difficulty teaching English articles effectively, especially ESL teachers, who
“find it difficult to understand how or why their students choose to use articles in the ways that they do” and
are still puzzling over effective ways of teaching the article system (Butler, 2002, p. 452).

Analyzing the distribution and colligation patterns of articles in the written compositions of Malaysian ESL
learners would give insights from the learners’ perspective that are highly valuable to teachers. It would
provide them with an understanding of how learners use articles, which would help teachers explain why errors
or mistakes are committed in using them. It also provides insight into the contextual use of the language,
especially in relation to language produced by second language learners. Studies in this area are still lacking,
and it can be considered one of the neglected aspects of the educational field in Malaysia, where much
attention is given to the development of language input. Teachers can therefore benefit from measuring
learners’ ability to use, and their comprehension of, the English article system, so as to develop plans for
handling the subject matter. Teachers should acknowledge that understanding the heart of the problem is the
fundamental basis for tackling it.

1.2 Objectives of the Study 

This study is aimed at investigating the usage patterns of articles in written compositions among Malaysian ESL 
learners. The specific objectives of the study are: 

1) To investigate the distribution of articles in written compositions among Malaysian ESL learners. 

2) To investigate the colligation patterns of articles according to word classes in written compositions among 
Malaysian ESL learners. 

1.3 Research Questions 

Based on the research objectives mentioned, the following research questions were devised: 

1) What are the distribution patterns of articles in written compositions among Malaysian ESL learners? 

2) What are the colligation patterns of articles according to word classes in written compositions among Malaysian 
ESL learners? 

2. Literature Review 

Corpus-based studies are relatively new in the Malaysian educational arena. Krieger (2003) mentioned that corpus 
linguistics is one particular area on the computer frontier that is yet to be fully explored. Apart from being greatly 
beneficial, corpus linguistics is increasingly seen as crucial nowadays as it can be applied to various aspects of 
language teaching and learning, especially the teaching and learning of English as a second or foreign language. 
Barlow (2002) listed three areas in which corpus linguistics can be applied in teaching, namely syllabus design, 
materials development, and classroom activities. However, the use of corpus linguistics is not limited to these
areas.

In Malaysia, very few English Language corpora have been compiled and developed. The most recent is
MCSAW, compiled by Mukundan & Rezvani Kalajahi (2013) from Malaysian ESL learners’ argumentative
writing. Another corpus, developed by Mukundan & Anealka (2007), comprises English Language textbooks
used in Malaysian secondary schools. Additionally, Menon (2009) compiled textbook corpora including the
English for Science and Technology (EST) and science textbooks. A number of studies have been based on
these corpora, but collectively they remain limited. Apart from these, there are the English Language of
Malaysian school corpus (Arshad et al., 2002) and the Corpus Archive of Learner English Sabah-Sarawak,
called CALES (Botley, De Alwis, Metom & Izza, 2005).

The English article system is said to be one of the most frequently used in the language, yet it is also among
the trickiest for non-native English speakers to teach and learn (Yamada & Matsuura, p. 50). As Alimi (2007)
pointed out, articles are complex grammatical structures. This statement is particularly true for most Asian
students. For example, Yoshii & Milne (1999) noted that almost all Japanese and Chinese students face
difficulties in using articles “...since they do not have articles in their languages...and, thus, cannot accurately
reproduce them from sentences they hear.” They also pointed out that article-related concepts are too
ambiguous and too vague to be applied to real-life situations. Researchers similarly stress that the lack of an
article system in the Korean language poses challenges to ESL/EFL learners in Korea, who commonly resort to
omitting obligatory articles in both their spoken and written English (Kim, 2006; Park, 2006).


Table 1. Word classes that commonly colligate with grammar articles 


Item                 Article                                                                 Example

English article ‘a’
Structure 1 (S1)     a + singular count nouns                                                a book
Structure 2 (S2)     a + words spelled with a vowel but pronounced with a consonant sound    a university
Structure 3 (S3)     a + adjectives                                                          a small book

English article ‘an’
Structure 4 (S4)     an + vowel sound singular count noun                                    an apple
Structure 5 (S5)     an + words spelled with a consonant but pronounced with a vowel sound   an hour
Structure 6 (S6)     an + vowel sound adjective                                              an oval table

English article ‘the’
Structure 7 (S7)     the + singular noun                                                     the table
Structure 8 (S8)     the + plural noun                                                       the tables
Structure 9 (S9)     the + noncount noun                                                     the furniture
Structure 10 (S10)   the + proper noun                                                       the Niagara falls
Structure 11 (S11)   the + superlative                                                       the prettiest / the most expensive
Structure 12 (S12)   the + ordinal                                                           the first / the second
Structure 13 (S13)   the + adjectives                                                        the naughty boy
Structure 14 (S14)   the + ‘only’                                                            the only person
Structure 15 (S15)   the + expression of time                                                in the morning

*Adapted from Mukundan, Leong & Nimehchisalem (2012, p.65) 


The structure comprises 15 items that define the common colligations of word classes with the articles a, an
and the, as illustrated in Table 1 above. Colligations of the articles a, an, and the with word classes were
divided into specific categories based on a structured list adapted from Mukundan, Leong & Nimehchisalem
(2012), as proposed by Celce-Murcia and Larsen-Freeman (1999). Stubbs (2001) stated that a large amount of
language use consists of words occurring in conventional combinations, and identified this as a central
characteristic of language in use. This suggests that predictable assemblages of lexis constitute a huge
proportion of normal language use. Colligation is common in nature, and articles, being among the most used
elements of English grammar, are no exception: they are among the items that colligate most regularly with
other words.
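The categories in Table 1 can be read as a lookup from an article and the class of the word that follows it to a structure label. The following Python sketch illustrates that reading only; it is not the procedure used in the study. The word-class labels (SINGULAR, PLURAL, ADJ and so on) and the function name classify are hypothetical, and the rule for S2 and S5 assumes the writer chose the article correctly, so that a mismatch between the article and the first letter of the next word signals a spelling-versus-sound case.

```python
VOWEL_LETTERS = "aeiou"

# Hypothetical word-class labels; the paper does not name a tag set.
THE_STRUCTURES = {
    "SINGULAR": "S7", "PLURAL": "S8", "NONCOUNT": "S9", "PROPER": "S10",
    "SUPERLATIVE": "S11", "ORDINAL": "S12", "ADJ": "S13", "ONLY": "S14",
    "TIME": "S15",
}


def classify(article, next_word, word_class):
    """Map an article plus the class of the following word to a Table 1 label.

    Returns None when the combination is not covered by Table 1.
    """
    article = article.lower()
    if article == "the":
        return THE_STRUCTURES.get(word_class)
    if article not in ("a", "an") or not next_word:
        return None
    if word_class not in ("SINGULAR", "ADJ"):
        return None
    vowel_letter = next_word[0].lower() in VOWEL_LETTERS
    if article == "a":
        if word_class == "ADJ":
            return "S3"
        # Assumption: "a" before a vowel letter means a consonant sound (S2).
        return "S2" if vowel_letter else "S1"
    # article == "an"
    if word_class == "ADJ":
        return "S6"
    # Assumption: "an" before a consonant letter means a vowel sound (S5).
    return "S4" if vowel_letter else "S5"


examples = [
    ("a", "book", "SINGULAR"),
    ("a", "university", "SINGULAR"),
    ("an", "hour", "SINGULAR"),
    ("the", "tables", "PLURAL"),
]
labels = [classify(*example) for example in examples]
```

Because the sketch labels form rather than correctness, a learner error such as an article placed before the wrong sound would still receive a label; separating correct from incorrect uses would need a pronunciation lexicon and manual checking.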

Cowie (1998) argued that words always tend to occur in preferred sequences, and described this as a linguistic
paradigm referred to as ‘phraseology’, which applies to phenomena including word combinations, collocations,
and prefabricated and formulaic expressions. The term ‘colligation’ is often used to refer to the combination of
lexis and grammar (Tognini-Bonelli, 2001; Hoey, 2005). In earlier work, Sinclair (1992) also noted that
different forms of a lemma, or different word classes of a word, have clearly distinct colligational preferences.
Hoey (1993), for example, investigated how signalling functions are performed in written text, showing how
particular collocations and colligations are associated with particular word functions by focusing on one
signalling word, reason. Yamasaki (2008) later used a large-scale corpus to study how the collocational and
colligational behaviour of anaphoric nouns differentiates their discourse functions within specific contexts,
adding to the evidence that “types of words and grammatical categories favoured or avoided by a particular
word or word sense vary considerably according to contextual usage and language variety”.

3. Methodology 

This research employs corpus-based analysis to study the distribution of articles in written compositions
among Malaysian ESL learners. It has been proposed that “corpus based analysis is an ideal tool to
reevaluate the presentation of linguistic features in textbooks and to make principled decisions about what to
prioritize” (Barbieri & Eckhardt, 2007, in Philip, Mukundan & Nimehchisalem, 2012, p. 3). A computer-aided
content analysis method is applied in this study to examine the frequency of articles in the written
compositions, their distribution patterns, and their colligation patterns according to word classes in the corpus.

3.1 Sample 

This study utilizes data from the Malaysian Corpus of Students’ Argumentative Writing (MCSAW) Version 1,
which comprises 296 essays from Form 4 students, 274 essays from Form 5 students and 440 essays from
college students. For the purpose of this study, only 50 essays from each group were sampled, giving a total of
150 essays. In terms of size, college students contributed the largest portion, 12807 tokens, followed by Form 4
students with 9932 tokens and Form 5 students with 9340 tokens, out of a total of 32079 running words.
Mukundan and Rezvani Kalajahi developed this corpus to establish baseline data on the English language
proficiency of Malaysian students in writing, to provide benchmarks of the learners’ language proficiency, and
to examine the language developmental patterns of learners across three age ranges and educational levels:
Form 4, Form 5 and college (Mukundan & Rezvani Kalajahi, 2013).

3.2 Instrumentation 

This study uses concordance software called Oxford WordSmith Tools version 4.0, developed by Michael Scott
(1996, 1997, 1999), as its main instrument to analyze the data from MCSAW. Many researchers have deemed
this software the most appropriate tool for corpus analysis (Mukundan, 2009; Mukundan & Menon, 2006;
Mukundan & Roslim, 2009). WordSmith Tools serves the purpose of looking at how words behave in texts.
The software has three major components, namely Concord Tools, Keywords Tools and WordList Tools, of
which only two were used in this study: Concord Tools was used to retrieve word clusters from the corpus,
while WordList Tools was used to retrieve the frequency of articles in the written compositions.
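For readers without access to WordSmith Tools, the two operations described above (a frequency word list and a concordance) can be approximated in a few lines of Python. The sketch below is an illustration under stated assumptions, not a reproduction of WordSmith: the function names, the simple regular-expression tokenizer and the sample sentence are all invented here, and a different tokenizer would give slightly different token counts.

```python
import re
from collections import Counter

ARTICLES = ("a", "an", "the")


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def article_frequencies(tokens):
    """Count how often each article occurs, as a word-list tool would."""
    counts = Counter(tokens)
    return {article: counts[article] for article in ARTICLES}


def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for every occurrence of node."""
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample text, used only to exercise the functions.
sample = "The Facebook is a social networking website. I opened an account in the morning."
tokens = tokenize(sample)
freqs = article_frequencies(tokens)
the_lines = concordance(tokens, "the")
```

The right-hand context returned by concordance is what an analyst would inspect to decide which word class follows each article.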

4. Results and Discussion 

4.1 Distribution of Articles 

All three articles (a, an, the) occur throughout the selected written compositions obtained from MCSAW. The
frequency of each article was determined to obtain the number of times it was used by the students in their
writing. There were a total of 5783 articles among the 32079 tokens in the compositions. The articles a, an and
the thus made up 18% of all the running words written by the students.


Table 2. Distribution of articles a, an, and the in 150 written composition among Malaysian ESL learners 


Articles    Frequency    Percentage
A           1021         17.7%
An          442          7.6%
The         4320         74.7%
Total       5783         100%


Table 2 shows the frequency of articles across the 150 sampled essays, which contain 5783 article tokens in
total. In percentage terms, the article the made up 74.7% of all article occurrences in the compositions, the
article a made up 17.7%, and an made up the least at 7.6%. Specifically, the article the had the highest usage
at 4320 occurrences, followed by the article a with 1021 occurrences, and lastly the article an, which occurred
only 442 times.
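The percentages in Table 2, and the 18% share of running words reported earlier, follow directly from the raw counts. As a minimal check of that arithmetic in Python (the variable names are chosen here for illustration):

```python
# Frequencies from Table 2 and the corpus sample size from Section 3.1.
counts = {"a": 1021, "an": 442, "the": 4320}
total_tokens = 32079

total_articles = sum(counts.values())

# Share of each article among all article tokens, in percent.
shares = {article: round(100 * n / total_articles, 1) for article, n in counts.items()}

# Share of articles among all running words, in percent.
article_share_of_tokens = round(100 * total_articles / total_tokens, 1)
```

Here total_articles evaluates to 5783, shares reproduces the 17.7%, 7.6% and 74.7% of Table 2, and article_share_of_tokens gives 18.0.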

From these data, it is clear that the article the was used most heavily by the students. The article an, on the
other hand, was used the least, while the article a was used moderately throughout the compositions. The high
frequency of a, an and the shows that articles are heavily used in the language, particularly in students’
writing. This indicates that articles are not systematically avoided or omitted in writing; hence, mastery of the
article system is a must for the students.

This finding supports the previous study by Mukundan, Leong & Nimehchisalem (2012), which claimed that
the article system is one of the most heavily used aspects of English grammar. The extensive use of definite
and indefinite articles shows that the article system is one of the fundamentals of grammar. This is especially
true in writing, where accurate use of the article system provides insight into students’ level of comprehension
of grammar. Master (1994) observed that in speech, articles seldom hinder getting the message through or
being intelligible, since other linguistic features can compensate for omitted articles while still allowing
naturalistic use of English; in written language, however, precision and accuracy in article use reflect the
writer’s fluency.

Chin (2000) strongly suggested that “the most beneficial way of helping students improve their command of
grammar in writing is to use students' writing as the basis for discussing grammatical concepts”. Based on the
findings of this study, the frequency of articles in students’ writing shows their importance as one of the
fundamental aspects of English grammar. Students therefore need to master this component of grammar in
order to progress to comprehension of many other aspects of grammar. Chin (2000) further stressed that
teachers should give more attention to grammatical concepts that are necessary for clear communication of
meaning in students’ writing. She added that teachers should prioritize and teach the grammatical elements
that most affect their students’ ability to write effectively, rather than being over-ambitious by striving to
teach them all.

Another strong view comes from Weaver (1998), who proposed that what students need most is guidance in
understanding and applying the grammar items that are most relevant to writing. Teachers should be aware of
frequencies (in this case, of the English articles a, an and the) to help determine which grammar elements to
prioritize in their teaching (Conrad, 2000). Furthermore, Philip, Mukundan & Nimehchisalem (2012)
emphasized that without adequate knowledge of the grammar system, students would not be competent
enough; an understanding of the grammar of the language is therefore necessary in order to function well in it.
Consequently, based on the findings of this study, it is recommended that teachers give more priority to the
teaching of articles, as they are extensively used in students’ writing. This would help students write better
and more confidently, because articles are present in every type of written composition and, whether students
are aware of it or not, they have to use articles in their writing.

4.2 Colligation Patterns of Articles According to Word Classes 

Table 3 shows the frequency of occurrences of colligations of a according to the specified word classes, which
indicates uneven usage. A similarly imbalanced pattern can be seen in the other colligations. As presented in
Table 4, the colligations of an with the associated word classes are not evenly distributed. Table 5 shows the
use of the article the, where S8 and S7 have the highest numbers of occurrences.


Table 3. Frequency of occurrences of article a in colligation patterns according to word classes 


Article Structures    Frequency
S1                    755
S2                    21
S3                    245


Table 4. Frequency of occurrences of article an in colligation patterns according to word classes 


Article Structures    Frequency
S4                    298
S5                    41
S6                    103


Table 5. Frequency of occurrences of article the in colligation patterns according to word classes 


Article Structures    Frequency
S7                    1068
S8                    2254
S9                    340
S10                   138
S11                   111
S12                   75
S13                   286
S14                   17
S15                   31


This study revealed the distribution patterns of article colligation with word classes in students’ written
compositions based on frequency. The findings show that students’ use of articles varied across different types
of colligation, and that the number of occurrences per word class was inconsistent. Some colligation patterns
have a very low number of occurrences, while others are heavily used. These findings are consistent with the
view that vocabulary choice in English sentences is not free but is governed by constraints on the
co-occurrence of words. Stubbs (2001), in a study of the phraseology of English, noted that the freedom to
combine words in text is much more restricted than often realised.
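The unevenness described above becomes easier to compare across articles when the raw frequencies in Tables 3 to 5 are converted into within-article shares. The Python sketch below does this; the dictionary layout and the helper name within_article_shares are illustrative choices made here, while the counts are taken from the tables.

```python
# Frequencies from Tables 3, 4 and 5, grouped by article.
tables = {
    "a": {"S1": 755, "S2": 21, "S3": 245},
    "an": {"S4": 298, "S5": 41, "S6": 103},
    "the": {"S7": 1068, "S8": 2254, "S9": 340, "S10": 138, "S11": 111,
            "S12": 75, "S13": 286, "S14": 17, "S15": 31},
}


def within_article_shares(freqs):
    """Return each structure's share of its article's total, in percent."""
    total = sum(freqs.values())
    return {label: round(100 * n / total, 1) for label, n in freqs.items()}


shares = {article: within_article_shares(freqs) for article, freqs in tables.items()}
totals = {article: sum(freqs.values()) for article, freqs in tables.items()}
```

The totals match the article frequencies in Table 2 (1021, 442 and 4320). Within each article, one structure dominates: S8 alone accounts for just over half of all uses of the, while S14 and S15 each account for less than 1%.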

According to Yamasaki (2008), one linguistic paradigm is that words tend to occur in preferred sequences. This
helps to explain the irregularity of the patterns within each group of article structures. Colligation is often not
addressed thoroughly in lessons. Students’ exposure to words and the ways they combine is often limited;
hence there is a lack of variety in their choice of words and in their use of appropriate articles in natural
contexts. Romer (2004) highlighted a problem in the teaching of English to L2 learners concerning learner
input: educators often fail to address the input pupils actually receive in their lessons, while much stress is put
on learner output. These students lack exposure to a variety of colligation patterns in their classroom activities
and materials.

One dimension worth noting is that many words in the English language carry several meanings; in other
words, they are ambiguous. Corpus-based studies in particular have revealed that different unspecific nouns
such as problem, reason, idea, and fault differ in their favoured syntactic patterns and in the favoured
premodifiers used in each pattern (Yamasaki, 2008). From the data collected and the findings, it can be inferred
that learners are often confused about, or even worse, cannot interpret, the exact meaning of these words in the
context in which they are used. These learners are therefore unable to colligate the words properly with the
right articles, and cases of article misuse, avoidance and omission can be seen. One obvious case taken from
the corpus data is the use of the articles a and the before the word Facebook, for example ‘I have Facebook
account’, ‘the Facebook is a social networking website’ and ‘a Facebook can help me with

5. Conclusion 

With the results of the study, English teachers are better equipped to recognize the structures students most
commonly use in writing, thereby enabling them to apply this knowledge in the classroom, where they can
better assist students in mastering articles. By observing the sentence structures students use in their writing,
teachers can gain a clearer picture of how students comprehend and apply articles in compositions.

For Malaysian teachers and researchers in particular, the findings provide insight into Malaysian students’
comprehension and practice in the use of articles, aiding research on grammar as well as ESL teaching in
Malaysian education. They will also help in the development and improvement of materials used in the
teaching and learning of English, and help teachers design activities that are more practical and relevant to the
needs of these learners. As suggested by Lawson (2001), only a corpus can provide clear perceptions of certain
linguistic features in real-life applications, such as lexico-grammatical associations. This study should help
teachers and researchers gain a clearer understanding of how articles are used by Malaysian students, who,
under the current educational curriculum, are moving towards the notional-functional approach to language
learning.

Teachers are advised to place more emphasis on the aspects that should be prioritized, such as exposing
students to language use in context. Materials and activities designed for the students should include all sorts
of article colligations across different word classes, not just specific ones. This can help learners familiarize
themselves with the variety of colligation patterns in the English language. Students should be given more
practice in using words with the articles they colligate with. Exposure is the key to better learning; hence
teachers should expose students to different types of colligation patterns throughout the whole process of
learning English, not just in specific grammar lessons. Thornbury (2002) claimed that the ability to remember
and understand the meanings and functions of words in a language is better achieved if the words are met at
least seven times. In addition, students’ ability to distinguish one noun from another is crucial; extra practice
on the different types of nouns, their meanings and their usage would therefore help students use the
appropriate article before nouns. The same stress should also be put on other grammatical items in the
language.

The study also brings to light the use of corpora for research on the grammar of articles. In addition, it
highlights the importance of understanding article use in context. The use of corpus linguistics in education, in
this case the teaching of English as a second language, is deemed vital because it explores most of the detailed
aspects of the language produced by L2 students. The findings of the study provide recommendations to help
determine areas for further research by language teachers and researchers alike regarding the use of articles
and their colligation patterns. Corpus linguistics may provide teachers with new data on the etymology and
definitional aspects of words used and produced by learners, particularly with regard to linguistics and
sociolinguistics respectively.

Further research is required to determine the competency of Malaysian secondary and college students in their
use of articles. An error analysis could be conducted on the same corpus used in this study; combined with the
present findings, it would shed more light on the improvements needed to bring students, as well as language
teachers, to a higher level of comprehension and competence regarding articles. Investigating the extent to
which materials and activities expose students to the variety of article usage and colligations may also help to
further analyze the distribution patterns of articles, not only in writing but in other components as well.